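Objective To build a quantitative structure-activity relationship (QSAR) model that can predict the hepatotoxicity of compounds from traditional Chinese medicines (TCM). Methods A total of 286 synthetic compounds and 62 TCM compounds were collected from the LTKB database and the Chinese literature and used as the training set. Models were built with three tree-based algorithms: simple decision tree, random forest, and boosted decision tree. To validate predictive ability, 22 TCM compounds (the external test set) were tested experimentally for hepatotoxicity, and the experimental results were compared with the model predictions. Results All three tree models showed good internal predictive ability, with internal cross-validation (leave-one-out and leave-10%-out) results all between 78% and 85%. However, the models built with the simple decision tree and random forest algorithms predicted non-hepatotoxic compounds with significantly lower accuracy than hepatotoxic compounds, showing a strong bias toward predicting hepatotoxicity. The model built with the boosted decision tree algorithm had a smaller bias and a relatively high overall predictive ability (accuracy 82%). The boosted decision tree model was therefore selected as the optimal model and used to predict the hepatotoxicity of the 22 TCM compounds in the external test set. Its accuracy reached 73%, which was higher, with a smaller bias, than that of a model built with synthetic compounds alone as the training set. Conclusion Using both synthetic compounds and TCM compounds as the training set, a boosted decision tree model with relatively high predictive ability for the hepatotoxicity of TCM compounds was established.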
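The two internal validation schemes named above (leave-one-out and leave-10%-out) can be sketched as index splits. This is a minimal illustration, not the authors' procedure: the abstract does not say how the 10% subsets were drawn, so the random, non-stratified partition, the function names, and the seed below are all assumptions.

```python
import random

def leave_one_out(n):
    """Yield (train_idx, test_idx) pairs, holding out one sample at a time."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]

def leave_fraction_out(n, fraction=0.10, seed=0):
    """Yield (train_idx, test_idx) pairs, holding out about `fraction` of the
    samples per fold. Assumption: a single random partition into 1/fraction folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    k = round(1 / fraction)                # 10 folds when fraction is 0.10
    folds = [idx[f::k] for f in range(k)]  # near-equal, disjoint folds
    for held_out in folds:
        held = set(held_out)
        yield [j for j in idx if j not in held], sorted(held_out)

n_compounds = 348  # training-set size stated in the abstract (286 + 62)
loo_splits = list(leave_one_out(n_compounds))
l10_splits = list(leave_fraction_out(n_compounds))
print(len(loo_splits), "leave-one-out splits;", len(l10_splits), "leave-10%-out splits")
```

Each split would be used to fit a model on the training indices and score it on the held-out indices; the model-fitting step is omitted here because the abstract does not specify the software or descriptors at that level of detail.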
Abstract
OBJECTIVE To build tree models for predicting the hepatotoxicity of compounds from traditional Chinese medicines (TCM). METHODS Three hundred and forty-eight compounds (256 with hepatotoxicity and 92 without) were collected from various databases and the literature and used as the training set to build tree models. Twenty-two compounds identified from TCM were first tested for hepatotoxicity experimentally and then used as the test set to evaluate the prediction accuracy of the optimal tree model. RESULTS Models built with the random forest algorithm had the highest overall predictive accuracy, 85% (leave-one-out), but were much less accurate for hepatotoxicity-negative compounds than for hepatotoxicity-positive compounds (a stronger positive bias). The model built with the boosted decision tree algorithm had a similar overall predictive accuracy and much less bias, and was therefore selected as the optimal model. The optimal model predicted the 22 test samples with an accuracy of 73%. The optimal model based on the training set containing both synthetic and TCM compounds had less bias than an optimal model based on a training set containing only the synthetic compounds. CONCLUSION Tree models with high predictive accuracy were built from a training set consisting of both synthetic and TCM compounds. The optimal model can predict the hepatotoxicity of TCM compounds with reasonable accuracy.
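The point about positive bias can be made concrete with the class sizes given above (256 hepatotoxic, 92 non-hepatotoxic). The sketch below is illustrative only: the "always positive" predictor is a hypothetical baseline, not one of the study's models, and defining bias as the difference between positive-class and negative-class accuracy is an assumption, since the abstract does not give a formula.

```python
def class_accuracies(y_true, y_pred):
    """Return overall accuracy, per-class accuracy, and their difference.

    Labels: 1 = hepatotoxic, 0 = non-hepatotoxic.
    'bias' here is positive-class accuracy minus negative-class accuracy
    (an assumed definition used for illustration).
    """
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    positive_acc = sum(p == 1 for p in pos) / len(pos)
    negative_acc = sum(p == 0 for p in neg) / len(neg)
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "overall": overall,
        "positive": positive_acc,
        "negative": negative_acc,
        "bias": positive_acc - negative_acc,
    }

# Class sizes from the abstract; the predictions are a hypothetical baseline
# that labels every compound as hepatotoxic.
y_true = [1] * 256 + [0] * 92
y_pred = [1] * 348
result = class_accuracies(y_true, y_pred)
print({name: round(value, 3) for name, value in result.items()})
```

Because the training set is imbalanced, this trivial baseline already scores 256/348 (about 74%) overall while getting every negative compound wrong, which is why per-class accuracy is reported alongside overall accuracy when choosing between models.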
Key words
TCM compound /
hepatotoxicity prediction /
boosted decision tree /
quantitative structure-activity relationship (QSAR)
CLC number:
R913
Funding
Supported by the National Basic Research Program of China (973 Program) (2009CB522807) and the National Natural Science Foundation of China (81173652)